Managing resources for multiple trial distributed processing tasks

ABSTRACT

A computer-implemented method of managing resources for multiple trial distributed processing tasks is presented. The method includes estimating an expected time needed to process each of a set of mask patterns which can be independently processed. The method further includes allocating each of the set of mask patterns to a set of processing cores in accordance with the expected time, and processing the mask patterns in accordance with the allocation, when the computer is invoked to estimate, allocate, and process.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application claims priority, under 35 U.S.C. § 119(e), from U.S. Provisional Application No. 62/414,650, filed on Oct. 28, 2016, entitled “MANAGING RESOURCES FOR MULTIPLE TRIAL DISTRIBUTED PROCESSING TASKS”, the content of which is incorporated herein by reference in its entirety.

BACKGROUND

The present disclosure relates generally to managing computer resources, and in particular, to a computer-implemented method for managing resources for multiple trial distributed processing tasks.

Semiconductor devices are manufactured using a series of process steps including the use of photomasks in photolithography to image patterns onto semiconductor masks and wafers. As semiconductor devices are reduced in size, the process of imaging patterns onto semiconductor masks and wafers is affected by a variety of optical effects including optical diffraction.

These optical effects can be compensated for through various computationally intensive approaches. For example, multiple test patterns may be simulated and the results can be analyzed by performing optical proximity correction (OPC) on those results. For another example, different parameters may be utilized in multiple independent trials for a given image to compensate for optical effects. That is, OPC is a technique for enhancing photolithography to compensate for image errors due to optical diffraction or other effects. As a result, modifications can be made to specific patterns within a mask to compensate for these types of optical effects, thereby improving the resulting imaged wafers during test or production.

OPC involves a set of computationally intensive tasks that can take days to perform using advanced computers. For example, multiple test patterns are typically simulated on a single mask, and each test pattern can be analyzed separately, allowing for the use of distributed processing to reduce the time needed to perform OPC for a given mask. Alternatively, multiple portions of a test layout may be analyzed independently or multiple trials may be performed independently using distributed processing.

With recent technology advances, the processing of mask data takes considerable computer resources. Therefore, there is a need for reducing the amount of computer resources needed for the processing of mask data.

SUMMARY

According to one embodiment of the present invention, a computer-implemented method of managing resources for multiple trial distributed processing tasks is presented. The method includes estimating an expected time needed to process each of a set of mask patterns which can be independently processed. The method further includes allocating each of the set of mask patterns to a set of processing cores in accordance with the expected time, and processing the mask patterns in accordance with the allocation, when the computer is invoked to estimate, allocate, and process.

According to one embodiment, the set of mask patterns are a set of templates. According to one embodiment, the set of templates are processed by the set of processing cores in multiple trials with different simulation parameters utilized in each trial.

According to one embodiment, the method further includes allocating each of the set of templates to each of the set of processing cores prior to processing a first one of the set of templates for a first trial. According to one embodiment, the method further includes reallocating each of the set of templates to each of the set of processing cores after the first trial. According to one embodiment, the method further includes allocating each of the set of templates as the templates are processed by the set of processing cores.

According to one embodiment of the present invention, a computer program product for managing resources for multiple trial distributed processing tasks is presented. The computer program product includes a computer readable storage medium having program instructions embodied therewith. The program instructions are executable by a processing circuit to cause the device to perform a method that includes estimating an expected time needed to process each of a set of mask patterns which can be independently processed, allocating each of the set of mask patterns to a set of processing cores in accordance with the expected time, and processing the mask patterns in accordance with the allocation, when the computer is invoked to estimate, allocate, and process.

According to one embodiment of the present invention, a data processing system for managing resources for multiple trial distributed processing tasks is presented. The data processing system includes a set of processing cores, and a memory storing program instructions. When executed by the processor, the program instructions execute the steps of estimating an expected time needed to process each of a set of mask patterns which can be independently processed, and allocating each of the set of mask patterns to the set of processing cores in accordance with the expected time. When executed by the processor, the program instructions further execute the steps of processing the mask patterns in accordance with the allocation, when the data processing system is invoked to estimate, allocate, and process.

A better understanding of the nature and advantages of the embodiments of the present invention may be gained with reference to the following detailed description and the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a simple exemplary block diagram of a data processing system in which embodiments of the present invention may be implemented.

FIG. 2 depicts a simple exemplary block diagram of a data processing environment in which embodiments of the present invention may be implemented.

FIGS. 3A-3C depict simple exemplary diagrams of a portion of a mask which may be used by embodiments of the present invention.

FIG. 4 depicts a simple exemplary diagram of multiple distributed cores for analyzing multiple trials of multiple templates, in accordance with one embodiment of the present invention.

FIG. 5 depicts a simple exemplary first flow diagram for optimizing processing resource usage, in accordance with one embodiment of the present invention.

FIGS. 6A-6D depict simple exemplary illustrations of a processing list, in accordance with one embodiment of the present invention.

FIG. 7 depicts a simple exemplary flow diagram for optimizing processing resource usage while running multiple trials, in accordance with one embodiment of the present invention.

FIG. 8 depicts a simple exemplary graph showing the effects of adding or subtracting processing cores when processing a trial, in accordance with embodiments of the present invention.

DETAILED DESCRIPTION

Processes and devices may be implemented and utilized for managing computer resources for multiple trial distributed processing tasks. These processes and apparatuses may be implemented and utilized as will be explained with reference to the various embodiments below.

FIG. 1 depicts a simple exemplary block diagram of a data processing system 100 in which embodiments of the present invention may be implemented. Data processing system 100 is one example of a suitable data processing system and is not intended to suggest any limitation as to the scope of use or functionality of the embodiments described herein. Regardless, data processing system 100 is capable of being implemented and/or performing any of the functionality set forth herein such as managing resources for multiple trial distributed processing tasks.

In data processing system 100 there is a computer system/server 112, which is operational with numerous other general purpose or special purpose computing system environments, peripherals, or configurations. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use with computer system/server 112 include, but are not limited to, personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems or devices, and the like.

Computer system/server 112 may be described in the general context of computer system executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, and so on that perform particular tasks or implement particular abstract data types. Computer system/server 112 may be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.

Computer system/server 112 in data processing system 100 is depicted in the form of a general-purpose computing device. The components of computer system/server 112 may include, but are not limited to, one or more processors or processing units 116, a system memory 128, and a bus 118 that couples various system components including system memory 128 to processor 116.

Bus 118 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.

Computer system/server 112 typically includes a variety of non-transitory computer system usable media. Such media may be any available media that is accessible by computer system/server 112, and it includes both volatile and non-volatile media, removable and non-removable media.

System memory 128 can include non-transitory computer system readable media in the form of volatile memory, such as random access memory (RAM) 130 and/or cache memory 132. Computer system/server 112 may further include other non-transitory removable/non-removable, volatile/non-volatile computer system storage media. By way of example, storage system 134 can be provided for reading from and writing to a non-removable, non-volatile magnetic medium (not depicted and typically called a “hard drive”). Although not depicted, a USB interface for reading from and writing to a removable, non-volatile memory chip (e.g., a “flash drive”), and an optical disk drive for reading from or writing to a removable, non-volatile optical disk such as a CD-ROM, DVD-ROM or other optical media can be provided. In such instances, each can be connected to bus 118 by one or more data media interfaces. Memory 128 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of the embodiments. Memory 128 may also include data that will be processed by a program product.

Program/utility 140, having a set (at least one) of program modules 142, may be stored in memory 128 by way of example, and not limitation, as well as an operating system, one or more application programs, other program modules, and program data. Each of the operating system, one or more application programs, other program modules, and program data or some combination thereof, may include an implementation of a networking environment. Program modules 142 generally carry out the functions and/or methodologies of the embodiments. For example, a program module may be software for managing resources for multiple trial distributed processing tasks.

Computer system/server 112 may also communicate with one or more external devices 114 such as a keyboard, a pointing device, a display 124, and the like; one or more devices that enable a user to interact with computer system/server 112; and/or any devices (e.g., network card, modem, and the like) that enable computer system/server 112 to communicate with one or more other computing devices. Such communication can occur via I/O interfaces 122 through wired connections or wireless connections. Still yet, computer system/server 112 can communicate with one or more networks such as a local area network (LAN), a general wide area network (WAN), and/or a public network (e.g., the Internet) via network adapter 120. As depicted, network adapter 120 communicates with the other components of computer system/server 112 via bus 118. It should be understood that although not depicted, other hardware and/or software components could be used in conjunction with computer system/server 112. Examples include, but are not limited to: microcode, device drivers, tape drives, RAID systems, redundant processing units, data archival storage systems, external disk drive arrays, and the like.

FIG. 2 depicts a simple exemplary block diagram of a data processing environment 200 in which embodiments of the present invention may be implemented. Data processing environment 200 is a network of data processing systems such as described above with reference to FIG. 1. Software applications such as for managing resources for multiple trial distributed processing tasks may execute on any computer or other type of data processing system in data processing environment 200. Data processing environment 200 includes network 210. Network 210 is the medium used to provide simplex, half duplex and/or full duplex communications links between various devices and computers connected together within data processing environment 200. Network 210 may include connections such as wire, wireless communication links, or fiber optic cables.

Server 220 and client 240 are coupled to network 210 along with storage unit 230. In addition, laptop 250 and facility 280 (such as a home or business) are coupled to network 210 including wirelessly such as through a network router 253. A mobile phone 260 may be coupled to network 210 through a mobile phone tower 262. Data processing systems, such as server 220, client 240, laptop 250, mobile phone 260 and facility 280 contain data and have software applications including software tools executing thereon. Other types of data processing systems such as personal digital assistants (PDAs), smartphones, tablets and netbooks may be coupled to network 210.

Server 220 may include software application 224 and data 226 for managing resources for multiple trial distributed processing tasks or other software applications and data in accordance with embodiments described herein. Storage 230 may contain software application 234 and a content source such as data 236 for managing resources for multiple trial distributed processing tasks. Other software and content may be stored on storage 230 for sharing among various computer or other data processing devices. Client 240 may include software application 244 and data 246. Laptop 250 and mobile phone 260 may also include software applications 254 and 264 and data 256 and 266. Facility 280 may include software applications 284 and data 286. Other types of data processing systems coupled to network 210 may also include software applications. Software applications could include a web browser, email, or other software application for managing resources for multiple trial distributed processing tasks.

Server 220, storage unit 230, client 240, laptop 250, mobile phone 260, and facility 280 and other data processing devices may couple to network 210 using wired connections, wireless communication protocols, or other suitable data connectivity. Client 240 may be, for example, a personal computer or a network computer.

In the depicted example, server 220 may provide data, such as boot files, operating system images, and applications to client 240 and laptop 250. Server 220 may be a single computer system or a set of multiple computer systems working together to provide services in a client server environment. Client 240 and laptop 250 may be clients to server 220 in this example. Client 240, laptop 250, mobile phone 260 and facility 280 or some combination thereof, may include their own data, boot files, operating system images, and applications. Data processing environment 200 may include additional servers, clients, and other devices that are not depicted.

In the depicted example, data processing environment 200 may be the Internet. Network 210 may represent a collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) and other protocols to communicate with one another. At the heart of the Internet is a backbone of data communication links between major nodes or host computers, including thousands of commercial, governmental, educational, and other computer systems that route data and messages. Of course, data processing environment 200 also may be implemented as a number of different types of networks, such as for example, an intranet, a local area network (LAN), or a wide area network (WAN). FIG. 2 is intended as an example, and not as an architectural limitation for the different illustrative embodiments.

Among other uses, data processing environment 200 may be used for implementing a client server environment in which the embodiments may be implemented. A client server environment enables software applications and data to be distributed across a network such that an application functions by using the interactivity between a client data processing system and a server data processing system. Data processing environment 200 may also employ a service oriented architecture where interoperable software components distributed across a network may be packaged together as coherent business applications.

FIGS. 3A-3C depict simple exemplary diagrams of a portion of a mask 300 which may be used by embodiments of the present invention. These could be portions of a test mask or templates on a production or test mask or simulated as such. A template is a type of mask pattern useful for performing proximity correction tasks. Other types of mask patterns can include double and multi-patterning images or other actual or simulated features on a mask which can be useful in semiconductor lithography operations. Imaging as described herein can include physical imaging or simulated imaging. In FIG. 3A, a portion of mask 300 includes twenty (20) test patterns imaged by a mask including test pattern 310 and test pattern 320. These test patterns may be imaged in one specific area of a mask or in various locations across the mask or their images may be simulated. In addition, many more test patterns may be imaged onto a single mask including multiple copies of the same test pattern in various locations of the mask to identify changes caused by imaging location. Each test pattern is often separated from other test patterns on the mask to avoid any interference between templates. However, multiple templates could be grouped together in a full layout test pattern. With a full layout approach, each template can have a context (proximity effects for geometries near the template boundary) so it can be simulated with knowledge of neighboring features. In alternative embodiments, selected portions of a test pattern may be analyzed separately and independently for different effects.

In this example, each test pattern can include specific patterns for identifying or otherwise characterizing certain lithography effects, which can then be optimized to improve the resulting test pattern or subsequent production patterns. For example, when magnified, one can see that test pattern 310 includes a series of parallel lines as depicted in FIG. 3B. Specific points within these parallel lines can be measured to identify specific effects of lithography such as diffraction. The effects can then be minimized or otherwise optimized by making adjustments to the test patterns or subsequent production patterns of masks used to image these types of patterns. For another example, when magnified, one can see that test pattern 320 of FIG. 3C includes a series of short lines with curves for identifying other effects of lithography. Each test pattern can be used to test certain effects of lithography. These test patterns can be commonly used across many companies, although many test patterns may be proprietary to a specific company.

A test layout can include one or more test patterns including a variety of test patterns or specific test elements that are not separated from each other, but are interconnected, adjacent to each other, or otherwise commingled. Each test pattern in a test layout can typically be analyzed using characterization and optimization tasks including performing repeated trials with different inputs. As a result, each test pattern may generally be analyzed separately and independently from other test patterns, thereby allowing for parallel processing of each test pattern. In addition, subsections of a given test pattern can be identified for separate and independent analysis, allowing for parallel processing of each identified test pattern subsection. A template is a test layout, test pattern or test pattern subsection which can be analyzed separately and independently from other templates using parallel processing. Templates can be created by software tools so they can be processed independently. Templates can be created with context from surrounding templates so that they can be processed independently but with knowledge about their environments.

At a higher level, repeated or multiple trials of templates imaged on a mask can be arranged with different inputs (e.g., a numerical aperture set to 0.6 or 0.8) so that each trial can be analyzed separately and independently from other trials, thereby allowing for parallel processing of each trial. In addition, each trial may be decomposed by template into multiple independent trials for parallel processing. Such is the case for optimizing optical proximity correction (OPC) in which it is desirable to evaluate alternative OPC treatments such as different numerical apertures, partial coherence, and the like.

In summary, multiple trials may be performed on templates derived from a mask whereby each trial of each template can be performed independently through parallel processing. This can result in hundreds or more independent trials of templates for characterization and optimization through parallel processing analysis.

FIG. 4 depicts a simple exemplary diagram of multiple distributed cores 400 for analyzing multiple trials of multiple templates, in accordance with one embodiment of the present invention. In this example, there is a managing core 405 with memory 406 and ten (10) processing cores 410-419 illustrated that are interconnected with a network 408. Each processing core can be a separate server or computer, a processor or set of processors within a server, a processor within a parallel processing system, and the like. Each processing core can process data independently from the other processing cores. Also, each processing core utilizes memory 420-429 that includes software 430-439 for processing data.

Alternatively, a shared memory 440 and a shared software image 445 can be utilized by processing cores 410-419, each processing core utilizing a separate instance of the shared software image. In another alternative, the software image may be stored in memory 406 of managing core 405. Depending on the type of software license agreement, the software may be limited to a specific set of processing cores, a certain number of processing cores at any one time, or other types of limitations. As a result, the number of processing cores available for analyzing trials across multiple templates in parallel may be limited by the number of physical processing cores (e.g., processors) available, the number of software licenses available, or the number of processing cores that can process data concurrently under a given software license agreement. The number of processing resources is the number of processing cores that have software licenses for processing trials of templates concurrently.

In a simple case, there may be only 2 trials for 20 different templates requiring 40 different computational processes of images, each template capable of being processed or otherwise analyzed independently for each trial. Also in this simple case, there are 40 processing cores, each processing core having a software license for performing the desired analysis, thereby resulting in 40 processing resources. Each of the 20 templates for both trials can then be analyzed by one of the 40 processing resources concurrently. In such a case, the time needed for processing both trials of all the templates will be the time to process a single trial for the longest running template.
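The simple case above can be sketched in a few lines. This is only an illustration: the template names, the run times, and the function `elapsed_when_fully_parallel` are hypothetical, and each template is assumed to take the same time in both trials.

```python
from itertools import product

# Hypothetical per-template run times in arbitrary units (not from the disclosure).
template_time = {f"T{i:02d}": float(i) for i in range(1, 21)}  # 20 templates
trials = [0.6, 0.8]  # e.g., two numerical aperture settings

# Every (trial, template) pair is an independent unit of work.
tasks = list(product(trials, template_time))


def elapsed_when_fully_parallel(tasks, template_time, resources):
    """Wall-clock time when every task has its own processing resource."""
    if resources < len(tasks):
        raise ValueError("this shortcut only holds when resources >= tasks")
    # All tasks run concurrently, so the slowest one sets the total time.
    return max(template_time[name] for _, name in tasks)


print(len(tasks))                                                  # 40
print(elapsed_when_fully_parallel(tasks, template_time, 40))       # 20.0
```

With fewer than 40 processing resources the shortcut no longer applies, and some tasks have to share a processing resource.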

Typically the time needed for running a trial varies by template. That is, in the simple case described above, the time needed for processing a first trial for a first template will be similar in length to processing a second separate trial on that same template. However, the time needed for processing the first trial on a second template will likely be different from the first trial on the first template. Processing the second trial on the second template, in turn, is usually comparable in length to the first trial on that same template, since the variation in processing time arises mainly between templates and depends on a variety of factors such as template complexity. Template complexity can vary due to template size, the number of template vertices, and the like.

If the number of processing resources is reduced, then the trials for each template or the number of processing resources for each trial need to be allocated. FIG. 5 depicts a simple exemplary first flow diagram 500 for optimizing processing resource usage, in accordance with one embodiment of the present invention. This process can be performed prior to allocating templates to processing cores for analyzing, or it may be performed at any time afterwards to reallocate templates to processing cores for analyzing such as described below with reference to FIG. 7. Referring again to FIG. 5, there are M processing cores and N template entries with N greater than M. If N is less than or equal to M then each template could be processed by a single processing core for each trial.

In a first step 510, an estimated time for processing each template is determined. This can be determined by performing a test run of each template using a simplified trial, processing a first trial on each template, or by examining each template and generating an estimation of processing time based on template complexity. Alternative embodiments may use historical information from prior processing of each template on other masks. If this process is being re-run after several trials have been processed, then an average, median or other measure may be used to determine an estimated time for processing each template for the remaining trials. Then in step 520, the template entries are ordered by estimated processing time into a processing list such as described below with reference to FIGS. 6A-6D. That is, the template entries are ordered from the fastest estimated processing time to the slowest estimated processing time (or from slowest to fastest).
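A minimal sketch of steps 510 and 520 follows, assuming run times observed in earlier trials are summarized by their median and that a vertex count stands in for template complexity when no history exists. The function names, the `seconds_per_vertex` calibration constant, and the sample data are all hypothetical.

```python
from statistics import median


def estimate_times(history, complexity, seconds_per_vertex=0.001):
    """Step 510 (sketch): return {template: estimated processing time}.

    history:    {template: [run times observed in earlier trials]}
    complexity: {template: vertex count}, used only when no history exists.
    seconds_per_vertex is an assumed calibration constant.
    """
    estimates = {}
    for name, vertices in complexity.items():
        observed = history.get(name, [])
        if observed:
            estimates[name] = median(observed)
        else:
            estimates[name] = vertices * seconds_per_vertex
    return estimates


def processing_list(estimates):
    """Step 520 (sketch): order entries from fastest to slowest estimate."""
    return sorted(estimates.items(), key=lambda item: item[1])


history = {"A": [12.1, 11.8, 12.4], "B": [4.0]}       # made-up observations
complexity = {"A": 9000, "B": 3500, "C": 9000}        # made-up vertex counts
ordered = processing_list(estimate_times(history, complexity))
print([name for name, _ in ordered])  # ['B', 'C', 'A']
```

The median is only one of the measures the text allows; an average or any other summary could be substituted in `estimate_times` without changing the ordering step.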

Subsequently, in step 530, the fastest estimated processing time is added to the second fastest estimated processing time so that the fastest and second fastest template entries (by processing time) will be processed by the same processing core. Then in step 540, the fastest estimated processing time entry can be removed from the processing list as it has been added to the second fastest estimated processing time. Then in step 550, it is determined whether the number of processing cores (M) equals the number of template entries remaining on the processing list (N). If not, then processing returns to step 520, otherwise this process ceases and the templates can be analyzed in accordance with the processing list for each trial.
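The loop of steps 520 through 550 can be sketched as follows. This is one possible reading of the flow diagram rather than a definitive implementation; the function name `allocate`, the tuple representation of a list entry, and the sample times are assumptions made for illustration.

```python
def allocate(estimates, num_cores):
    """Merge the two fastest entries until num_cores entries remain.

    estimates: {template: estimated processing time}
    Returns a list of (templates, total_time) pairs, one per processing core,
    ordered from fastest to slowest.
    """
    if num_cores < 1:
        raise ValueError("need at least one processing core")
    entries = [((name,), t) for name, t in estimates.items()]
    while len(entries) > num_cores:                       # step 550 test
        entries.sort(key=lambda entry: entry[1])          # step 520
        (fast_names, fast_t), (next_names, next_t) = entries[0], entries[1]
        merged = (fast_names + next_names, fast_t + next_t)   # step 530
        entries = [merged] + entries[2:]                  # step 540
    return sorted(entries, key=lambda entry: entry[1])


# Hypothetical estimates in arbitrary units.
result = allocate({"P": 2, "Q": 3, "R": 4, "S": 10}, num_cores=2)
print(result)  # [(('R', 'P', 'Q'), 9), (('S',), 10)]
```

When there are at least as many cores as templates, the loop body never runs and each template keeps its own entry, which matches the N less than or equal to M case.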

In an alternative embodiment, each template may be an entry in a processing list and allocated to a specific processing core. The allocation is similarly performed using the method described above with reference to FIG. 5, but with a link for each template entry to a processing core such that one or more processing cores may have multiple templates allocated to them. Many other types of processing lists or other data representations can be utilized to implement the process described above.

FIGS. 6A-6D depict simple exemplary illustrations of a processing list, in accordance with one embodiment of the present invention. In this simplified example, there are 6 templates (referred to as templates A through F) and 3 processing cores so that three templates need to be allocated to processing cores with other templates in this embodiment. In FIG. 6A, a processing list 600 is depicted with templates A-F, each template depicted with an expected time for execution in an arbitrary unit of time. The expected time can be in terms of time, processing cycles, or any other unit of measure which could be utilized for representing the relative time needed for processing that template for a given trial. The expected time could also be normalized. FIG. 6A is depicted with the templates already ordered by processing time as would be performed in step 520 above. Templates B and E are the two fastest templates and should be processed by the same processing core serially.

In FIG. 6B, an updated processing list 610 is depicted with templates B and E combined for processing on a common processing core serially. The entries have been reordered by time of processing. In this example, E has a run time of 3 and B a run time of 4, so the combination would have a run time of 7. Since there are 5 entries in processing list 610, the fastest two entries should be combined, which would be template D and template F for processing by the same processing core serially.

In FIG. 6C, an updated processing list 620 is depicted with templates D and F combined for processing on a common processing core serially. The entries have been reordered by time of processing. In this example, F has a run time of 5 and D a run time of 6, so the combination would have a run time of 11. Since there are 4 entries in processing list 620, the fastest two entries should be combined, which would be templates B, E and template C for processing by the same processing core serially.

In FIG. 6D, an updated processing list 630 is depicted with templates B, E and C combined for processing on a common processing core serially. The entries have been reordered by time of processing. In this example, B and E have a combined run time of 7 and C a run time of 9, so the combination would have a run time of 16. Since all 6 of the templates have now been allocated into 3 lines in processing list 630, each line being associated with one of the 3 different processing cores, there is no more need for further template allocation.
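The arithmetic of FIGS. 6B-6D can be written out as follows, where t_X denotes the expected time of template X. The run time of template A is not stated in the text, so it is left symbolic as t_A; the trial time T is taken to be the largest per-core total, on the assumption that the three processing cores run in parallel while templates sharing a core run serially.

```latex
\begin{aligned}
t_{BE}  &= t_E + t_B = 3 + 4 = 7 \\
t_{DF}  &= t_F + t_D = 5 + 6 = 11 \\
t_{BEC} &= t_{BE} + t_C = 7 + 9 = 16 \\
T       &= \max\left(t_A,\ t_{DF},\ t_{BEC}\right) = \max\left(t_A,\ 11,\ 16\right)
\end{aligned}
```

Because template A was never among the two fastest entries of processing list 620, t_A is at least 9. If t_A is no greater than 16, the trial time is 16; otherwise template A alone sets the trial time.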

FIG. 7 depicts a simple exemplary flow diagram 700 for optimizing processing resource usage while running multiple trials, in accordance with one embodiment of the present invention. In this case there are P trials to be run for N templates on M processing cores. As a result, there will be a total of P×N analyses of templates with M processing cores. In a first step 710, the templates are allocated for processing by the processing cores as described above with reference to FIG. 5. This generates an initial allocation of templates for a given trial across the processing cores. Alternatively, the templates may be allocated randomly or in some fixed pattern across the processing cores before proceeding. The process described above with reference to FIG. 5 may be performed after the first trial is completed. In a second step 720, counter C is set to 1. In a third step 730, all templates for trial C are analyzed using the current allocation of the templates for each processing core.

Then in steps 740 and 750, it is determined whether the current allocation of templates for each processing core needs adjusting. In step 740, this is assessed by comparing how quickly each template was processed against the overall trial runtime, using the previous trial, the past few trials, or all trials since this process was initiated. In step 750, it is determined whether that comparative analysis is within an acceptable predetermined threshold. This analysis can be performed by template or by processing core. The predetermined threshold may be, for example, the largest runtime penalty specified by the user for reducing the number of cores for a trial. If an adjustment is needed, then processing proceeds to step 760; otherwise processing continues to step 770.

In step 760, the templates are reallocated for processing by the processing cores as described above with reference to FIG. 5. This generates an updated allocation of templates for a given trial across the processing cores. That is, the templates are reallocated utilizing the analysis from step 740. Processing then continues to step 770. In step 770, counter C is incremented by 1. In step 780, if counter C now exceeds the number of trials P, then processing comes to an end; otherwise processing returns to step 730.
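The loop of flow diagram 700 might be sketched in Python as shown below. This is a hedged illustration under stated assumptions: the names `allocate`, `run_trials` and `analyze` are hypothetical, the allocation helper simply merges the two fastest entries until one entry per core remains, and the imbalance measure used for steps 740 and 750 (the relative gap between the slowest and fastest core) is one assumed choice among the comparisons the description allows.

```python
import heapq

def allocate(times, num_cores):
    """Merge the two fastest entries until one entry per core remains."""
    heap = [(t, [name]) for name, t in times.items()]
    heapq.heapify(heap)
    while len(heap) > num_cores:
        t1, n1 = heapq.heappop(heap)
        t2, n2 = heapq.heappop(heap)
        heapq.heappush(heap, (t1 + t2, n1 + n2))
    return [names for _, names in heap]

def run_trials(num_trials, estimates, num_cores, analyze, threshold):
    """analyze(trial, template) is a caller-supplied function returning a run time."""
    allocation = allocate(estimates, num_cores)              # step 710
    reallocations = 0
    c = 1                                                    # step 720
    while c <= num_trials:                                   # step 780 exit test
        # Step 730: analyze every template for trial c (serially here, for clarity).
        measured = {name: analyze(c, name)
                    for group in allocation for name in group}
        # Step 740: compare per-core run times from this trial (assumed metric).
        core_times = [sum(measured[n] for n in group) for group in allocation]
        slowest = max(core_times)
        imbalance = (slowest - min(core_times)) / slowest if slowest else 0.0
        if imbalance > threshold:                            # step 750
            allocation = allocate(measured, num_cores)       # step 760
            reallocations += 1
        c += 1                                               # step 770
    return allocation, reallocations

# Hypothetical data: equal initial estimates, unequal measured run times.
estimates = {"a": 1, "b": 1, "c": 1, "d": 1}
actual = {"a": 1, "b": 1, "c": 5, "d": 5}
final_allocation, count = run_trials(
    3, estimates, 2, lambda trial, name: actual[name], threshold=0.5)
```

In this example the initial allocation is based on equal estimates, so the first trial reveals a large imbalance and triggers one reallocation; the later trials stay within the assumed threshold and reuse the updated allocation.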

FIG. 8 depicts a simple exemplary graph 800 showing the effects of adding or subtracting processing cores when processing a trial, in accordance with embodiments of the present invention. In this illustration, the time needed to analyze each template is normalized so that the longest time to process any of the templates for a given trial is set to a value of one. This illustration assumes 20 different templates, all of which require analysis for a given trial. This graph can be generated by applying the method described above with reference to FIG. 5 multiple times while varying the number of processing cores available when analyzing a given number of templates (20 in this example).

Since the longest normalized time needed for analyzing a template is one, the shortest time in which all templates can be analyzed is also one. Clearly this can occur when the number of processing cores used equals the number of templates, as depicted at point 810 of the graph. However, some templates may be combined on a common processing core without extending the time needed to analyze all 20 templates, as depicted at point 820 of the graph. As a result, if time is of the essence, all 20 templates could be analyzed for a given trial using only 15 processing cores without noticeably increasing overall trial time. Given that there is typically a cost savings in reducing the number of processing cores, such analysis would be beneficial. These cost savings include a reduction in hardware used and software used, both of which incur costs such as hardware purchases and software licensing.

If time is not of the essence and a cost-benefit analysis could be performed, then graph 800 illustrates how such an analysis could be utilized. For example, one processing core (hardware and software) may cost a certain amount and reducing the amount of time to process a given trial (or set of trials) may have a certain benefit by reducing chip design cycle time. The costs and benefits could be compared to determine that an optimum number of processing cores is 7 with a normalized time of 2 as depicted at point 830. As a result, only 7 cores would be needed to analyze 20 templates in such an example. Of course, the exact number would vary by entity needing the OPC analysis. In addition, the graph depicted could vary significantly depending on the normalized times needed to analyze templates in a given example.
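A curve of the kind shown in graph 800 could be produced along the following lines. This Python sketch is illustrative only: the function names are hypothetical, the six run times are assumed values rather than the 20 templates of FIG. 8, and the allocation step is the same merge-the-two-fastest heuristic assumed for illustration, which need not yield the best possible trial time.

```python
import heapq

def makespan(times, num_cores):
    """Longest per-core total after merging the two fastest entries
    until one entry per core remains."""
    heap = list(times)
    heapq.heapify(heap)
    while len(heap) > num_cores:
        heapq.heappush(heap, heapq.heappop(heap) + heapq.heappop(heap))
    return max(heap)

def normalized_curve(times):
    """Trial time for each core count, normalized to the longest single template."""
    longest = max(times)
    return {m: makespan(times, m) / longest for m in range(1, len(times) + 1)}

def fewest_cores(times, max_penalty):
    """Smallest core count whose normalized trial time stays within 1 + max_penalty."""
    curve = normalized_curve(times)
    return min(m for m, t in curve.items() if t <= 1.0 + max_penalty)

times = [10, 3, 9, 6, 4, 5]          # assumed template run times
curve = normalized_curve(times)
no_penalty = fewest_cores(times, 0.0)
with_penalty = fewest_cores(times, 0.7)
```

For these assumed run times, 5 cores already reach the minimum normalized time of one, and accepting a runtime penalty of 0.7 reduces the requirement to 3 cores. A cost-benefit comparison would then weigh the per-core cost against the value of the shorter trial time at each point on the curve.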

The above described techniques for managing resources for multiple trial distributed processing tasks may also be utilized for processing other types of mask patterns. For example, the above described techniques could be utilized for decomposition of multiple patterning such as double patterning technology (DPT).

The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.

The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks depicted in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage media, and cache memories, which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage media during execution.

A data processing system may act as a server data processing system or a client data processing system. Server and client data processing systems may include data storage media that are computer usable, such as being computer readable. A data storage medium associated with a server data processing system may contain computer usable code such as for managing resources for multiple trial distributed processing tasks. A client data processing system may download that computer usable code, such as for storing on a data storage medium associated with the client data processing system, or for using in the client data processing system. The server data processing system may similarly upload computer usable code from the client data processing system such as a content source. The computer usable code resulting from a computer usable program product embodiment of the illustrative embodiments may be uploaded or downloaded using server and client data processing systems in this manner.

Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, and the like) can be coupled to the system either directly or through intervening I/O controllers.

Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.

The above embodiments of the present invention are illustrative and not limiting. Various alternatives and equivalents are possible. In addition, the technique and system of the present invention are suitable for use with a wide variety of electronic design automation (EDA) tools and methodologies for designing, testing, and/or manufacturing. The scope of the invention should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the pending claims along with their full scope or equivalents.

What is claimed is:
 1. A computer-implemented method of managing resources for multiple trial distributed processing tasks to facilitate fabrication of a semiconductor integrated circuit, the method comprising: receiving imaging data representative of each of a set of mask patterns defined within corresponding portions of a test mask, and wherein each of the set of mask patterns are configured to compensate for optical effects in the fabrication of the semiconductor integrated circuit; estimating an expected time needed to process each of the set of mask patterns which can be independently processed; allocating each of the set of mask patterns to a processing core within a set of processing cores in accordance with the expected time, wherein allocating each of the set of mask patterns comprises: identifying at least two mask patterns of the set of mask patterns having a shortest expected time needed to process; and allocating the identified at least two mask patterns to a first processing core within the set of processing cores; and wherein each of the set of processing cores is allocated at least one mask pattern of the set of mask patterns; and processing each of the set of mask patterns in accordance with the allocation.
 2. The method of claim 1, wherein the set of mask patterns are a set of templates.
 3. The method of claim 2, wherein the set of templates are processed by the set of processing cores in multiple trials with different simulation parameters utilized in each trial.
 4. The method of claim 3 further comprising allocating each of the set of templates to each of the set of processing cores prior to processing a first one of the set of templates for a first trial.
 5. The method of claim 4 further comprising reallocating each of the set of templates to each of the set of processing cores after the first trial.
 6. The method of claim 3 further comprising allocating each of the set of templates as the templates are processed by the set of processing cores.
 7. The method of claim 6 further comprising reallocating each of the set of templates to each of the set of processing cores after the first trial.
 8. A computer program product for managing resources for multiple trial distributed processing tasks to facilitate fabrication of a semiconductor integrated circuit, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processing circuit to cause the device to perform a method comprising: receiving imaging data representative of each of a set of mask patterns defined within corresponding portions of a test mask, and wherein each of the set of mask patterns are configured to compensate for optical effects in the fabrication of the semiconductor integrated circuit; estimating an expected time needed to process each of a set of mask patterns which can be independently processed; allocating each of the set of mask patterns to a processing core within a set of processing cores in accordance with the expected time, wherein allocating each of the set of mask patterns comprises: identifying at least two mask patterns of the set of mask patterns having a shortest expected time needed to process; and allocating the identified at least two mask patterns to a first processing core within the set of processing cores; and wherein each of the set of processing cores is allocated at least one mask pattern of the set of mask patterns; and processing each of the set of mask patterns in accordance with the allocation.
 9. The method of claim 8, wherein the set of mask patterns are a set of templates.
 10. The method of claim 9, wherein the set of templates are processed by the set of processing cores in multiple trials with different simulation parameters utilized in each trial.
 11. The method of claim 10 further comprising allocating each of the set of templates to each of the set of processing cores prior to processing a first one of the set of templates for a first trial.
 12. The method of claim 11 further comprising reallocating each of the set of templates to each of the set of processing cores after the first trial.
 13. The method of claim 10 further comprising allocating each of the set of templates as the templates are processed by the set of processing cores.
 14. A data processing system for managing resources for multiple trial distributed processing tasks to facilitate fabrication of a semiconductor integrated circuit, the data processing system comprising: a set of processing cores; and a memory storing program instructions which when executed by the processor execute the steps of: receiving imaging data representative of each of a set of mask patterns defined within corresponding portions of a test mask, and wherein each of the set of mask patterns are configured to compensate for optical effects in the fabrication of the semiconductor integrated circuit; estimating an expected time needed to process each of a set of mask patterns which can be independently processed; allocating each of the set of mask patterns to a processing core within the set of processing cores in accordance with the expected time, wherein allocating each of the set of mask patterns comprises: identifying at least two mask patterns of the set of mask patterns having a shortest expected time needed to process; and allocating the identified at least two mask patterns to a first processing core within the set of processing cores; and wherein each of the set of processing cores is allocated at least one mask pattern of the set of mask patterns; and processing each of the set of mask patterns in accordance with the allocation.
 15. The method of claim 14, wherein the set of mask patterns are a set of templates.
 16. The method of claim 15, wherein the set of templates are processed by the set of processing cores in multiple trials with different simulation parameters utilized in each trial.
 17. The method of claim 16 further comprising allocating each of the set of templates to each of the set of processing cores prior to processing a first one of the set of templates for a first trial.
 18. The method of claim 17 further comprising reallocating each of the set of templates to each of the set of processing cores after the first trial.
 19. The method of claim 16 further comprising allocating each of the set of templates as the templates are processed by the set of processing cores.
 20. The method of claim 19 further comprising reallocating each of the set of templates to each of the set of processing cores after the first trial.